Naive Bayes vs. Decision Trees vs. Neural Networks in the Classification of Training Web Pages
Web classification has been attempted through many different technologies. In this study we concentrate on the comparison of Neural Networks (NN), Naïve Bayes (NB) and Decision Tree (DT) classifiers for the automatic analysis and classification of attribute data from training course web pages. We introduce an enhanced NB classifier and run the same data sample through the DT and NN classifiers to determine the success rate of our classifier in the training courses domain. This research shows that our enhanced NB classifier not only outperforms the traditional NB classifier, but also performs as well as, if not better than, some more popular rival techniques. This paper also shows that, overall, our NB classifier is the best choice for the training courses domain, achieving an impressive F-Measure value of over 97%, despite being trained with fewer samples than any of the classification systems we have encountered.
DNA methylation-based biological age, genome-wide average DNA methylation, and conventional breast cancer risk factors.
DNA methylation-based biological age (DNAm age), as well as genome-wide average DNA methylation, have been reported to predict breast cancer risk. We aimed to investigate the associations between these DNA methylation-based risk factors and 18 conventional breast cancer risk factors in disease-free women. A sample of 479 individuals from the Australian Mammographic Density Twins and Sisters was used for discovery, a sample of 3354 individuals from the Melbourne Collaborative Cohort Study was used for replication, and meta-analyses pooling results from the two studies were conducted. DNAm age based on three epigenetic clocks (Hannum, Horvath and Levine) and genome-wide average DNA methylation were calculated using the HumanMethylation 450 K BeadChip assay data. The DNAm age measures were positively associated with body mass index (BMI), smoking, alcohol drinking and age at menarche (all nominal P < 0.05). Genome-wide average DNA methylation was negatively associated with smoking and number of live births, and positively associated with age at first live birth (all nominal P < 0.05). The association of DNAm age with BMI was also evident in within-twin-pair analyses that control for familial factors. This study suggests that some lifestyle and hormonal risk factors are associated with these DNA methylation-based breast cancer risk factors, and the observed associations are unlikely to be due to familial confounding but are likely causal. DNA methylation-based risk factors could interplay with conventional risk factors in modifying breast cancer risk.
Making Big Data Useful for Health Care: A Summary of the Inaugural MIT Critical Data Conference
With growing concerns that big data will only augment the problem of unreliable research, the Laboratory of Computational Physiology at the Massachusetts Institute of Technology organized the Critical Data Conference in January 2014. Thought leaders from academia, government, and industry across disciplines--including clinical medicine, computer science, public health, informatics, biomedical research, health technology, statistics, and epidemiology--gathered and discussed the pitfalls and challenges of big data in health care. The key message from the conference is that the value of large amounts of data hinges on the ability of researchers to share data, methodologies, and findings in an open setting. If empirical value is to be derived from the analysis of retrospective data, groups must continuously work together on similar problems to create more effective peer review. This will lead to improvement in methodology and quality, with each iteration of analysis resulting in more reliability.
Subtidal macrozoobenthos communities from northern Chile during and post El Niño 1997–1998
Despite a large amount of climatic and oceanographic information dealing with the recurring climate phenomenon El Niño (EN) and its well-known impact on the diversity of marine benthic communities, most published data are rather descriptive, and consequently our understanding of the underlying mechanisms and processes that drive community structure during EN is still very limited. In this study, we address two questions on the effects of EN on macrozoobenthic communities: (1) how does EN affect species diversity of the communities in northern Chile? and (2) is EN a phenomenon that restarts community assembling processes by affecting species interactions in northern Chile? To answer these questions, we compared species diversity and co-occurrence patterns of soft-bottom macrozoobenthos communities from the continental shelf off northern Chile during (March 1998) and after (September 1998) the strong EN event of 1997–1998. The methods used ranged from species diversity and species co-occurrence analyses to multivariate ordination methods.
Our results indicate that EN positively affects the diversity of macrozoobenthos communities in the study area, increasing species richness and diversity and decreasing species dominance. EN represents a strong disturbance that affects the species interactions governing species assembling processes in shallow-water, sea-bottom environments.
The Chandra Source Catalog
The Chandra Source Catalog (CSC) is a general purpose virtual X-ray
astrophysics facility that provides access to a carefully selected set of
generally useful quantities for individual X-ray sources, and is designed to
satisfy the needs of a broad-based group of scientists, including those who may
be less familiar with astronomical data analysis in the X-ray regime. The first
release of the CSC includes information about 94,676 distinct X-ray sources
detected in a subset of public ACIS imaging observations from roughly the first
eight years of the Chandra mission. This release of the catalog includes point
and compact sources with observed spatial extents <~ 30''. The catalog (1)
provides access to the best estimates of the X-ray source properties for
detected sources, with good scientific fidelity, and directly supports
scientific analysis using the individual source data; (2) facilitates analysis
of a wide range of statistical properties for classes of X-ray sources; and (3)
provides efficient access to calibrated observational data and ancillary data
products for individual X-ray sources, so that users can perform detailed
further analysis using existing tools. The catalog includes real X-ray sources
detected with flux estimates that are at least 3 times their estimated 1 sigma
uncertainties in at least one energy band, while maintaining the number of
spurious sources at a level of <~ 1 false source per field for a 100 ks
observation. For each detected source, the CSC provides commonly tabulated
quantities, including source position, extent, multi-band fluxes, hardness
ratios, and variability statistics, derived from the observations in which the
source is detected. In addition to these traditional catalog elements, for each
X-ray source the CSC includes an extensive set of file-based data products that
can be manipulated interactively.
Comment: To appear in The Astrophysical Journal Supplement Series, 53 pages,
27 figures
From 10 Kelvin to 10 TeraKelvin: Insights on the Interaction Between Cosmic Rays and Gas in Starbursts
Recent work has both illuminated and mystified our attempts to understand
cosmic rays (CRs) in starburst galaxies. I discuss my new research exploring
how CRs interact with the ISM in starbursts. Molecular clouds provide targets
for CR protons to produce pionic gamma rays and ionization, but those same
losses may shield the cloud interiors. In the densest molecular clouds, gamma
rays and Al-26 decay can provide ionization, at rates up to those in Milky Way
molecular clouds. I then consider the free-free absorption of low frequency
radio emission from starbursts, which I argue arises from many small, discrete
H II regions rather than from a "uniform slab" of ionized gas, whereas
synchrotron emission arises outside them. Finally, noting that the hot
superwind gas phase fills most of the volume of starbursts, I suggest that it
has turbulent-driven magnetic fields powered by supernovae, and that this phase
is where most synchrotron emission arises. I show how such a scenario could
explain the far-infrared radio correlation, in context of my previous work. A
big issue is that radio and gamma-ray observations imply CRs also must interact
with dense gas. Understanding how this happens requires a more advanced
understanding of turbulence and CR propagation.
Comment: Conference proceedings for "Cosmic-ray induced phenomenology in
star-forming environments: Proceedings of the 2nd Session of the Sant Cugat
Forum of Astrophysics" (April 16-19, 2012). 16 pages, 5 figures
Science and Ideology in Economic, Political, and Social Thought
This paper has two sources. One is my own research in three broad areas: business cycles, economic measurement and social choice. In all of these fields I attempted to apply the basic precepts of the scientific method as it is understood in the natural sciences. I found that my effort at using natural science methods in economics was met with little understanding and often considerable hostility. I found economics to be driven less by common sense and empirical evidence than by various ideologies that exhibited either a political or a methodological bias, or both. This brings me to the second source: several books have appeared recently that describe in historical terms the ideological forces that have shaped either the direct areas in which I worked, or a broader background. These books taught me that the ideological forces in the social sciences are even stronger than I imagined on the basis of my own experiences.
The scientific method is the antipode to ideology. I feel that the scientific work that I have done on specific, long-standing and fundamental problems in economics and political science has given me additional insights into the destructive role of ideology, beyond the history-of-thought orientation of the works I will be discussing.
Evaluation Research and Institutional Pressures: Challenges in Public-Nonprofit Contracting
This article examines the connection between program evaluation research and decision-making by public managers. Drawing on neo-institutional theory, a framework is presented for diagnosing the pressures and conditions that lead alternatively toward or away from the rational use of evaluation research. Three cases of public-nonprofit contracting for the delivery of major programs are presented to clarify the way coercive, mimetic, and normative pressures interfere with a sound connection being made between research and implementation. The article concludes by considering how public managers can respond to the isomorphic pressures in their environment that make it hard to act on data relating to program performance. This publication is Hauser Center Working Paper No. 23. The Hauser Center Working Paper Series was launched during the summer of 2000. The Series enables the Hauser Center to share with a broad audience important works-in-progress written by Hauser Center scholars and researchers.